Timothy W Russell*, Joel Hellewell, Sam Abbott, Nick Holding, Hamish Gibbs, Christopher I Jarvis, Kevin Van Zandvoort, CMMID COVID-19 working group, Stefan Flasche, Rosalind M Eggo, W John Edmunds, Adam J Kucharski

authors contributed equally

* corresponding author

Last Updated: 2020-04-07

Aim

To estimate the percentage of symptomatic COVID-19 cases reported in different countries using case fatality ratio estimates based on data from the ECDC, correcting for delays between confirmation and death.

Methods Summary

Current estimates for percentage of symptomatic cases reported for countries with greater than ten deaths

Temporal variation

_Figure 1: Temporal variation in reporting rate. We calculate the percentage of cases reported on each day a country has had more than ten deaths. We then fit a Generalised Additive Model (GAM) to these data (see the Temporal variation model fitting section for details), highlighting the temporal trend of each country's reporting rate. The red shaded region is the 95% CI of the fitted GAM._

Current estimates

Figure 2: Estimates of the proportion of symptomatic cases reported in different countries, based on delay-corrected CFR (cCFR) estimates. Dark blue shading is the 25% - 75% confidence range and light blue is the 2.5% - 97.5% confidence range.

Table of current estimates

| Country | Percentage of cases reported (95% CI) | Total cases | Total deaths |
|---|---|---|---|
| Albania | 11% (6% - 21%) | 259 | 15 |
| Algeria | 5.2% (3.7% - 7.3%) | 847 | 58 |
| Andorra | 16% (8.6% - 31%) | 390 | 14 |
| Argentina | 18% (11% - 29%) | 1133 | 31 |
| Australia | 100% (80% - 100%) | 4976 | 21 |
| Austria | 44% (34% - 58%) | 10711 | 146 |
| Belgium | 8% (6.7% - 9.5%) | 13964 | 828 |
| Bosnia and Herzegovina | 16% (8.4% - 32%) | 464 | 13 |
| Brazil | 12% (9.4% - 15%) | 6836 | 241 |
| Burkina Faso | 9.3% (5.2% - 18%) | 261 | 14 |
| Canada | 36% (27% - 48%) | 9595 | 109 |
| Chile | 86% (47% - 100%) | 3031 | 16 |
| China | 33% (29% - 38%) | 82395 | 3316 |
| Colombia | 28% (16% - 53%) | 1065 | 17 |
| Czech Republic | 51% (33% - 79%) | 3589 | 39 |
| Democratic Republic of the Congo | 4.7% (2.6% - 9.6%) | 123 | 11 |
| Denmark | 19% (14% - 26%) | 3107 | 104 |
| Dominican Republic | 8.5% (6% - 12%) | 1284 | 57 |
| Ecuador | 9% (6.9% - 12%) | 2758 | 146 |
| Egypt | 9.9% (6.8% - 15%) | 710 | 46 |
| Finland | 56% (31% - 100%) | 1446 | 17 |
| France | 7.3% (6.3% - 8.3%) | 56989 | 4032 |
| Germany | 47% (39% - 56%) | 73522 | 872 |
| Greece | 18% (13% - 27%) | 1375 | 50 |
| Honduras | 5.5% (3.2% - 10%) | 219 | 14 |
| Hungary | 15% (8.7% - 26%) | 525 | 20 |
| India | 15% (10% - 22%) | 1965 | 50 |
| Indonesia | 5.6% (4.4% - 7.2%) | 1677 | 157 |
| Iran | 11% (9.2% - 12%) | 47593 | 3036 |
| Iraq | 8.3% (5.8% - 12%) | 694 | 50 |
| Ireland | 21% (15% - 29%) | 3447 | 85 |
| Israel | 100% (73% - 100%) | 5591 | 21 |
| Italy | 6.2% (5.4% - 6.9%) | 110574 | 13157 |
| Japan | 28% (19% - 41%) | 2178 | 57 |
| Lebanon | 29% (15% - 63%) | 479 | 12 |
| Luxembourg | 46% (29% - 76%) | 2319 | 29 |
| Malaysia | 43% (29% - 66%) | 2908 | 45 |
| Mexico | 17% (11% - 26%) | 1378 | 37 |
| Morocco | 7.3% (5% - 11%) | 654 | 39 |
| Netherlands | 6.4% (5.4% - 7.4%) | 13614 | 1173 |
| Norway | 100% (64% - 100%) | 4665 | 32 |
| Pakistan | 37% (23% - 60%) | 2291 | 31 |
| Panama | 20% (13% - 32%) | 1317 | 32 |
| Peru | 12% (8.5% - 19%) | 1323 | 47 |
| Philippines | 9.3% (6.9% - 12%) | 2311 | 96 |
| Poland | 29% (19% - 44%) | 2554 | 43 |
| Portugal | 20% (15% - 25%) | 8251 | 187 |
| Puerto Rico | 5% (2.8% - 10%) | 286 | 11 |
| Romania | 13% (9.4% - 18%) | 2460 | 85 |
| Russia | 39% (23% - 67%) | 2777 | 24 |
| Saudi Arabia | 57% (32% - 100%) | 1720 | 16 |
| Serbia | 19% (11% - 32%) | 1060 | 23 |
| Slovenia | 40% (22% - 78%) | 841 | 15 |
| South Korea | 68% (52% - 88%) | 9976 | 169 |
| Spain | 6% (5.2% - 6.8%) | 102136 | 9053 |
| Sweden | 12% (9.7% - 15%) | 4947 | 239 |
| Switzerland | 28% (23% - 34%) | 17070 | 378 |
| Thailand | 88% (45% - 100%) | 1771 | 12 |
| Turkey | 17% (14% - 22%) | 15679 | 277 |
| Ukraine | 10% (6.2% - 18%) | 794 | 20 |
| United Kingdom | 4.9% (4.2% - 5.6%) | 29474 | 2532 |
| United States of America | 17% (15% - 19%) | 216721 | 5138 |

Table 1: Estimates for the proportion of symptomatic cases reported in different countries using cCFR estimates based on case and death time-series data from the ECDC. Total cases and deaths in each country are also shown. Confidence intervals were calculated using an exact binomial test at the 95% confidence level.
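For readers unfamiliar with exact binomial intervals, the following is a minimal Python sketch of a Clopper-Pearson interval for a proportion, found by bisection on the binomial tail probabilities. The function names and the example counts are hypothetical, and the sketch does not show how an interval on a fatality ratio is carried through to an interval on the percentage reported.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)


def _root_increasing(func, iterations=100):
    """Bisection on (0, 1) for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_increasing(
        lambda p: alpha / 2.0 - binom_cdf(k, n, p))
    return lower, upper


# Hypothetical example: 15 deaths among 120 cases with known outcomes.
low, high = clopper_pearson(15, 120)
print(f"point estimate {15 / 120:.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

Bisection is used here only to keep the sketch free of third-party dependencies; a statistics library's beta quantile function would give the same bounds more directly.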

Adjusting for outcome delay in CFR estimates

During an outbreak, the naive CFR (nCFR), i.e. the ratio of reported deaths to date to reported cases to date, will underestimate the true CFR because the outcome (recovery or death) is not yet known for all cases [5]. We can therefore estimate the true denominator for the CFR (i.e. the number of cases with known outcomes) by accounting for the delay from confirmation to death [1].

We assumed the delay from confirmation to death followed the same distribution as the estimated delay from hospitalisation to death, based on data from the COVID-19 outbreak in Wuhan, China, between the 17th December 2019 and the 22nd January 2020, accounting for right-censoring in the data as a result of as-yet-unknown disease outcomes (Figure 1, panels A and B in [7]). The distribution used is a lognormal fit with a mean delay of 13 days and a standard deviation of 12.7 days [7].
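The delay distribution is quoted by its mean and standard deviation, whereas a lognormal is usually parameterised on the log scale. A minimal Python sketch of the conversion, together with one possible daily discretisation, is shown below. The function names, the 60-day truncation and the binning of the delay into whole days are illustrative assumptions rather than details taken from the analysis.

```python
import math


def lognormal_params(mean, sd):
    """Convert a natural-scale mean and sd to log-scale (mu, sigma)."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a lognormal distribution."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def daily_delay_pmf(mean=13.0, sd=12.7, max_days=60):
    """Assumed discretisation: f_j = P(j <= delay < j + 1) for j = 0..max_days - 1."""
    mu, sigma = lognormal_params(mean, sd)
    return [lognormal_cdf(j + 1, mu, sigma) - lognormal_cdf(j, mu, sigma)
            for j in range(max_days)]


mu, sigma = lognormal_params(13.0, 12.7)
delay_pmf = daily_delay_pmf()
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print(f"probability mass within 60 days = {sum(delay_pmf):.3f}")
```

Because the lognormal has a long right tail, a truncated daily distribution does not sum exactly to one; whether to renormalise it is a modelling choice not settled by the text.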

To correct the CFR, we use the case and death incidence data to estimate the proportion of cases with known outcomes [1,6]:

\[ u_{t} = \frac{ \sum_{j = 0}^{t} c_{t-j} f_j}{c_t}, \]

where \(u_t\) is the proportion of cases with known outcomes at time \(t\) [1,5,6], used to scale the cumulative number of cases in the denominator of the cCFR, \(c_{t}\) is the daily case incidence at time \(t\), and \(f_j\) is the proportion of cases with a delay of \(j\) days between confirmation and death.
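The correction can be illustrated with a short Python sketch. It computes the numerator of \(u_t\) for each day and, as one reading of how the cumulative denominator is scaled, sums these daily values to give the number of cases with known outcomes. The function names, the toy incidence series and the toy delay distribution are all hypothetical.

```python
def known_outcomes_on_day(cases, delay_pmf, t):
    """Numerator of u_t: sum over j = 0..t of c_{t-j} * f_j."""
    total = 0.0
    for j in range(t + 1):
        # Delays longer than the supplied distribution are given zero weight.
        f_j = delay_pmf[j] if j < len(delay_pmf) else 0.0
        total += cases[t - j] * f_j
    return total


def corrected_cfr(cases, deaths, delay_pmf):
    """Naive and delay-corrected CFR from daily case and death incidence."""
    # Assumption: the corrected denominator is the sum of daily known outcomes.
    cumulative_known = sum(known_outcomes_on_day(cases, delay_pmf, t)
                           for t in range(len(cases)))
    total_cases = sum(cases)
    total_deaths = sum(deaths)
    return {
        "nCFR": total_deaths / total_cases,
        "cCFR": total_deaths / cumulative_known,
        "known_outcomes": cumulative_known,
    }


# Toy data, not from any country: four days of incidence and a three-day delay pmf.
toy_cases = [10, 20, 40, 80]
toy_deaths = [0, 0, 1, 2]
toy_delay_pmf = [0.0, 0.5, 0.5]
print(corrected_cfr(toy_cases, toy_deaths, toy_delay_pmf))
```

With these toy numbers the naive CFR is 2% (3 deaths out of 150 cases), while the corrected CFR is 6% (3 deaths out of 50 cases with known outcomes), showing how a growing epidemic makes the naive ratio too low.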

Approximating the proportion of symptomatic cases reported

At this stage, raw estimates of the CFR of COVID-19 correcting for delay to outcome (the corrected CFR, cCFR), but not for under-reporting, have been calculated. These estimates range between 1% and 1.5% [1–3]. We assume a CFR of 1.4% (95% CrI: 1.2-1.7%), taken from a recent large study [3], as a baseline CFR. We use it to approximate the potential level of under-reporting in each country. Specifically, we calculate \(\frac{1.4\%}{\text{cCFR}}\) for each country to estimate the approximate fraction of cases reported.
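As a small worked illustration of this ratio, the Python sketch below divides the baseline CFR by a country's cCFR. The function name is hypothetical, and capping the result at 100% is an assumption suggested by the 100% entries in Table 1 rather than a rule stated in the text.

```python
def fraction_reported(ccfr, baseline_cfr=0.014):
    """Approximate fraction of symptomatic cases reported: baseline CFR / cCFR."""
    if ccfr <= 0:
        raise ValueError("cCFR must be positive")
    # Assumption: a cCFR below the baseline is read as complete reporting.
    return min(1.0, baseline_cfr / ccfr)


# Hypothetical cCFR values, not taken from the table.
for ccfr in (0.07, 0.028, 0.01):
    print(f"cCFR = {ccfr:.1%} -> reported = {fraction_reported(ccfr):.0%}")
```

A cCFR of 7%, five times the baseline, implies that roughly one in five symptomatic cases is reported, while any cCFR at or below 1.4% maps to 100% under the capping assumption.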

Temporal variation model fitting

We estimate the level of under-reporting on every day for each country that has had more than ten deaths. We then fit a Generalised Additive Model (GAM) of the form \[ \log(\mathbb{E}[D]) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p, \] specifying a Poisson distribution on deaths (\(D\)) as the response variable. The model has a log-link function and a log-offset, \(\log(\kappa)\), where \(\kappa\) combines the daily known outcomes (\(u_t c_t\)) and the cCFR estimate for that country on that day, \(\text{cfr}_t\). The model can then be written as \[ D \sim s(t) + \underbrace{\log(u_t c_t) + \log(\text{cfr}_t)}_{:= \log(\kappa)}, \] where \(s(t)\) is a smoothing spline fitted through the time points (days) for which we have data.

Limitations

Implicit in assuming that the under-reporting is \(\frac{1.4\%}{\text{cCFR}}\) for a given country is that any deviation from the assumed 1.4% CFR is entirely due to under-reporting. In reality, the burden on the healthcare system is a likely contributing factor to CFR estimates higher than 1.4%, along with many other country-specific factors.

The following is a list of the other prominent assumptions made in our analysis:

Code and data availability

The code is publicly available at https://github.com/thimotei/CFR_calculation. The data required for this analysis are time series of both cases and deaths, along with the corresponding delay distribution. We scrape these data from the ECDC using the NCoVUtils package [8].

References

1 Russell TW, Hellewell J, Jarvis CI et al. Estimating the infection and case fatality ratio for COVID-19 using age-adjusted data from the outbreak on the Diamond Princess cruise ship. medRxiv 2020.

2 Verity R, Okell LC, Dorigatti I et al. Estimates of the severity of COVID-19 disease. medRxiv 2020.

3 Guan W-j, Ni Z-y, Hu Y et al. Clinical characteristics of coronavirus disease 2019 in China. New England Journal of Medicine 2020.

4 Shim E, Mizumoto K, Choi W et al. Estimating the risk of COVID-19 death during the course of the outbreak in Korea, February-March, 2020. medRxiv 2020.

5 Kucharski AJ, Edmunds WJ. Case fatality rate for Ebola virus disease in West Africa. The Lancet 2014;384:1260.

6 Nishiura H, Klinkenberg D, Roberts M et al. Early epidemiological assessment of the virulence of emerging infectious diseases: A case study of an influenza pandemic. PLoS One 2009;4.

7 Linton NM, Kobayashi T, Yang Y et al. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data. Journal of Clinical Medicine 2020;9:538.

8 Abbott S MJ Hellewell J. NCoVUtils: Utility functions for the 2019-nCoV outbreak. doi:10.5281/zenodo.3635417 2020.